Overview

Dataset statistics

Number of variables: 13
Number of observations: 2988650
Missing cells: 1876559
Missing cells (%): 4.8%
Duplicate rows: 37207
Duplicate rows (%): 1.2%
Total size in memory: 925.2 MiB
Average record size in memory: 324.6 B
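
The percentages and the average record size follow from the raw counts. The sketch below recomputes them from the figures listed in this overview; the variable names are illustrative and not taken from the profiling tool, and the size calculation assumes 1 MiB = 2**20 bytes.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 13
n_observations = 2_988_650
missing_cells = 1_876_559
duplicate_rows = 37_207
total_size_mib = 925.2

# A "cell" is one variable of one observation.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}")
print(f"Duplicate rows (%): {duplicate_pct:.1f}")
print(f"Average record size: {avg_record_bytes:.1f} B")
```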

Variable types

Numeric: 8
Categorical: 4
Text: 1

Alerts

Duplicates: Dataset has 37207 (1.2%) duplicate rows
High correlation: brand is highly overall correlated with cat1 and 1 other field
High correlation: cat1 is highly overall correlated with brand and 1 other field
High correlation: cat2 is highly overall correlated with brand and 1 other field
High correlation: cust_request_tn is highly overall correlated with customer_id and 2 other fields
High correlation: customer_id is highly overall correlated with cust_request_tn and 1 other field
High correlation: product_id is highly overall correlated with cust_request_tn and 2 other fields
High correlation: sku_size is highly overall correlated with product_id
High correlation: tn is highly overall correlated with cust_request_tn and 2 other fields
Imbalance: plan_precios_cuidados is highly imbalanced (91.0%)
Missing: stock_final has 1839319 (61.5%) missing values
Skewed: cust_request_tn is highly skewed (γ1 = 37.70988076)
Skewed: tn is highly skewed (γ1 = 37.87580231)
Zeros: stock_final has 34082 (1.1%) zeros

Reproduction

Analysis started: 2024-05-26 16:59:46.057374
Analysis finished: 2024-05-26 17:10:43.619881
Duration: 10 minutes and 57.56 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json

Variables

periodo
Real number (ℝ)

Distinct: 36
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 201801.71
Minimum: 201701
Maximum: 201912
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.8 MiB

Quantile statistics

Minimum: 201701
5-th percentile: 201702
Q1: 201709
Median: 201805
Q3: 201903
95-th percentile: 201910
Maximum: 201912
Range: 211
Interquartile range (IQR): 194

Descriptive statistics

Standard deviation: 81.458276
Coefficient of variation (CV): 0.00040365503
Kurtosis: -1.4831421
Mean: 201801.71
Median Absolute Deviation (MAD): 97
Skewness: 0.087900502
Sum: 6.0311468 × 10^11
Variance: 6635.4507
Monotonicity: Increasing
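
The values of periodo look like year and month packed into one integer (YYYYMM). That is an assumption, based on the 36 distinct values running from 201701 to 201912. If it holds, the range (211) and IQR (194) reported above are in encoded units, not months. The sketch below converts periods to a month index so that differences are in calendar months; the helper names are hypothetical.

```python
def yyyymm_to_month_index(period: int) -> int:
    """Map a YYYYMM integer (e.g. 201912) to a running month count."""
    year, month = divmod(period, 100)
    if not 1 <= month <= 12:
        raise ValueError(f"not a valid YYYYMM value: {period}")
    return year * 12 + (month - 1)


def months_between(start: int, end: int) -> int:
    """Number of calendar months from start to end (both YYYYMM)."""
    return yyyymm_to_month_index(end) - yyyymm_to_month_index(start)


# Minimum/maximum and Q1/Q3 copied from the quantile statistics above.
span_months = months_between(201701, 201912)
iqr_months = months_between(201709, 201903)

print(span_months, iqr_months)
```

Under this reading the data span 35 months (36 periods) and the middle half of the rows covers 18 months.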
Histogram with fixed size bins (bins=36)
Common values

Value              Count    Frequency (%)
201803             102366   3.4%
201706             97420    3.3%
201705             97377    3.3%
201710             94046    3.1%
201708             93856    3.1%
201711             93590    3.1%
201709             92545    3.1%
201703             91966    3.1%
201903             90974    3.0%
201806             90667    3.0%
Other values (26)  2043843  68.4%

Minimum 10 values

Value   Count  Frequency (%)
201701  80143  2.7%
201702  83688  2.8%
201703  91966  3.1%
201704  84185  2.8%
201705  97377  3.3%
201706  97420  3.3%
201707  75644  2.5%
201708  93856  3.1%
201709  92545  3.1%
201710  94046  3.1%

Maximum 10 values

Value   Count  Frequency (%)
201912  58782  2.0%
201911  74666  2.5%
201910  77774  2.6%
201909  82457  2.8%
201908  67794  2.3%
201907  83040  2.8%
201906  83414  2.8%
201905  74731  2.5%
201904  79021  2.6%
201903  90974  3.0%

customer_id
Real number (ℝ)

HIGH CORRELATION 

Distinct: 597
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10171.395
Minimum: 10001
Maximum: 10637
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.8 MiB

Quantile statistics

Minimum: 10001
5-th percentile: 10007
Q1: 10053
Median: 10133
Q3: 10267
95-th percentile: 10447
Maximum: 10637
Range: 636
Interquartile range (IQR): 214

Descriptive statistics

Standard deviation: 142.03964
Coefficient of variation (CV): 0.013964617
Kurtosis: -0.22512637
Mean: 10171.395
Median Absolute Deviation (MAD): 98
Skewness: 0.81392422
Sum: 3.0398741 × 10^10
Variance: 20175.26
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count    Frequency (%)
10001               25122    0.8%
10004               24487    0.8%
10003               24100    0.8%
10002               23553    0.8%
10007               23469    0.8%
10018               22645    0.8%
10027               22634    0.8%
10059               21983    0.7%
10005               21518    0.7%
10034               19705    0.7%
Other values (587)  2759434  92.3%

Minimum 10 values

Value  Count  Frequency (%)
10001  25122  0.8%
10002  23553  0.8%
10003  24100  0.8%
10004  24487  0.8%
10005  21518  0.7%
10006  18419  0.6%
10007  23469  0.8%
10008  9192   0.3%
10009  16749  0.6%
10010  11464  0.4%

Maximum 10 values

Value  Count  Frequency (%)
10637  2      < 0.1%
10636  5      < 0.1%
10635  51     < 0.1%
10634  16     < 0.1%
10633  2      < 0.1%
10632  2      < 0.1%
10631  21     < 0.1%
10630  65     < 0.1%
10629  8      < 0.1%
10626  187    < 0.1%

product_id
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1233
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20423.189
Minimum: 20001
Maximum: 21299
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.8 MiB

Quantile statistics

Minimum: 20001
5-th percentile: 20023
Q1: 20155
Median: 20360
Q3: 20650
95-th percentile: 21014
Maximum: 21299
Range: 1298
Interquartile range (IQR): 495

Descriptive statistics

Standard deviation: 312.94155
Coefficient of variation (CV): 0.015322854
Kurtosis: -0.6238578
Mean: 20423.189
Median Absolute Deviation (MAD): 237
Skewness: 0.59229109
Sum: 6.1037765 × 10^10
Variance: 97932.413
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count    Frequency (%)
20037                9996     0.3%
20100                9812     0.3%
20020                9706     0.3%
20230                9360     0.3%
20010                9222     0.3%
20021                8782     0.3%
20105                8204     0.3%
20022                8008     0.3%
20111                7973     0.3%
20122                7950     0.3%
Other values (1223)  2899637  97.0%

Minimum 10 values

Value  Count  Frequency (%)
20001  6172   0.2%
20002  6000   0.2%
20003  6793   0.2%
20004  7139   0.2%
20005  5911   0.2%
20006  6497   0.2%
20007  6906   0.2%
20008  6453   0.2%
20009  5596   0.2%
20010  9222   0.3%

Maximum 10 values

Value  Count  Frequency (%)
21299  1      < 0.1%
21298  1      < 0.1%
21297  1      < 0.1%
21296  1      < 0.1%
21295  1      < 0.1%
21294  1      < 0.1%
21293  1      < 0.1%
21292  1      < 0.1%
21291  1      < 0.1%
21290  2      < 0.1%

plan_precios_cuidados
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 165.3 MiB

Value  Count
0      2954621
1      34029

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2988650
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count    Frequency (%)
0      2954621  98.9%
1      34029    1.1%
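
The 91.0% imbalance figure in the alerts can be reproduced from these two counts. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalized by the maximum entropy for that number of classes. That definition matches the reported value here, but it has not been checked against the profiler's source, and the function name is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy (in bits) of the
    class frequencies: 0 means perfectly balanced, 1 means one class only."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1 - entropy / math.log2(n_classes)


# Counts for plan_precios_cuidados = 0 and = 1, from the table above.
score = imbalance_score([2_954_621, 34_029])
print(f"{100 * score:.1f}%")
```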

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value  Count    Frequency (%)
0      2954621  98.9%
1      34029    1.1%

Most occurring characters

Value  Count    Frequency (%)
0      2954621  98.9%
1      34029    1.1%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2988650  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2988650  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2988650  100.0%

All characters fall into a single category, script and block, so the most-frequent-character tables per category, per script and per block are identical to "Most occurring characters".

cust_request_qty
Real number (ℝ)

Distinct: 84
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1495019
Minimum: 1
Maximum: 92
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.8 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 7
Maximum: 92
Range: 91
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 3.5805763
Coefficient of variation (CV): 1.6657702
Kurtosis: 54.039778
Mean: 2.1495019
Median Absolute Deviation (MAD): 0
Skewness: 6.3281174
Sum: 6424109
Variance: 12.820527
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count    Frequency (%)
1                  2066593  69.1%
2                  451619   15.1%
3                  152223   5.1%
4                  83170    2.8%
5                  47223    1.6%
6                  32179    1.1%
7                  23616    0.8%
8                  18792    0.6%
9                  14345    0.5%
10                 11975    0.4%
Other values (74)  86915    2.9%

Minimum 10 values

Value  Count    Frequency (%)
1      2066593  69.1%
2      451619   15.1%
3      152223   5.1%
4      83170    2.8%
5      47223    1.6%
6      32179    1.1%
7      23616    0.8%
8      18792    0.6%
9      14345    0.5%
10     11975    0.4%

Maximum 10 values

Value  Count  Frequency (%)
92     1      < 0.1%
90     1      < 0.1%
88     1      < 0.1%
85     2      < 0.1%
84     1      < 0.1%
83     1      < 0.1%
79     1      < 0.1%
78     1      < 0.1%
77     1      < 0.1%
76     1      < 0.1%

cust_request_tn
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 101954
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.47691053
Minimum: 0.0001
Maximum: 551.56137
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.8 MiB

Quantile statistics

Minimum: 0.0001
5-th percentile: 0.00209
Q1: 0.01057
Median: 0.04095
Q3: 0.1638
95-th percentile: 1.604051
Maximum: 551.56137
Range: 551.56127
Interquartile range (IQR): 0.15323

Descriptive statistics

Standard deviation: 3.276818
Coefficient of variation (CV): 6.8709283
Kurtosis: 2789.3242
Mean: 0.47691053
Median Absolute Deviation (MAD): 0.03658
Skewness: 37.709881
Sum: 1425318.6
Variance: 10.737536
Monotonicity: Not monotonic
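
Two of the descriptive statistics above are simple to recompute. The coefficient of variation is the standard deviation divided by the mean. Skewness (γ1) is the third central moment divided by the 1.5 power of the second. The sketch below uses the population-moment form; the profiler may apply a small-sample correction (not verified here), which would be negligible at roughly three million rows. The toy data are hypothetical and only show that a long right tail gives a positive value.

```python
def coefficient_of_variation(std: float, mean: float) -> float:
    """CV = standard deviation / mean."""
    return std / mean


def skewness(values):
    """Fisher-Pearson skewness g1 = m3 / m2**1.5 from population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Standard deviation and mean of cust_request_tn, copied from the report.
cv = coefficient_of_variation(3.276818, 0.47691053)
print(f"CV = {cv:.4f}")

# Hypothetical toy sample: mostly small values plus one large one.
toy = [0.01, 0.02, 0.04, 0.16, 5.0]
print(f"toy skewness = {skewness(toy):.2f}")
```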
Histogram with fixed size bins (bins=50)
Common values

Value                  Count    Frequency (%)
0.01638                19921    0.7%
0.04095                16229    0.5%
0.00218                15964    0.5%
0.00819                15171    0.5%
0.0819                 14603    0.5%
0.00983                14272    0.5%
0.03276                14052    0.5%
0.02457                13014    0.4%
0.01092                12684    0.4%
0.00491                12504    0.4%
Other values (101944)  2840236  95.0%

Minimum 10 values

Value    Count  Frequency (%)
0.0001   170    < 0.1%
0.00013  79     < 0.1%
0.00018  159    < 0.1%
0.0002   238    < 0.1%
0.00021  628    < 0.1%
0.00022  104    < 0.1%
0.00023  744    < 0.1%
0.00025  299    < 0.1%
0.00026  262    < 0.1%
0.00029  137    < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
551.56137  1      < 0.1%
510.65893  1      < 0.1%
444.41192  1      < 0.1%
439.90647  1      < 0.1%
437.37767  1      < 0.1%
416.64823  1      < 0.1%
407.02225  1      < 0.1%
393.26092  1      < 0.1%
389.02653  1      < 0.1%
384.82574  1      < 0.1%

tn
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 101922
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.46684062
Minimum: 0.0001
Maximum: 547.87849
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.8 MiB

Quantile statistics

Minimum: 0.0001
5-th percentile: 0.00209
Q1: 0.01052
Median: 0.0409
Q3: 0.1638
95-th percentile: 1.58995
Maximum: 547.87849
Range: 547.87839
Interquartile range (IQR): 0.15328

Descriptive statistics

Standard deviation: 3.1598884
Coefficient of variation (CV): 6.7686664
Kurtosis: 2850.397
Mean: 0.46684062
Median Absolute Deviation (MAD): 0.03653
Skewness: 37.875802
Sum: 1395223.2
Variance: 9.9848948
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count    Frequency (%)
0.01638                19931    0.7%
0.04095                16228    0.5%
0.00218                15965    0.5%
0.00819                15181    0.5%
0.0819                 14608    0.5%
0.00983                14272    0.5%
0.03276                14070    0.5%
0.02457                13013    0.4%
0.01092                12686    0.4%
0.00491                12502    0.4%
Other values (101912)  2840194  95.0%

Minimum 10 values

Value    Count  Frequency (%)
0.0001   170    < 0.1%
0.00013  79     < 0.1%
0.00018  159    < 0.1%
0.0002   238    < 0.1%
0.00021  628    < 0.1%
0.00022  104    < 0.1%
0.00023  746    < 0.1%
0.00025  299    < 0.1%
0.00026  262    < 0.1%
0.00029  137    < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
547.87849  1      < 0.1%
469.45761  1      < 0.1%
439.90647  1      < 0.1%
437.37767  1      < 0.1%
430.90803  1      < 0.1%
414.05146  1      < 0.1%
389.02653  1      < 0.1%
386.60688  1      < 0.1%
384.82574  1      < 0.1%
379.4427   1      < 0.1%

cat1
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 7448
Missing (%): 0.2%
Memory size: 169.7 MiB

Value  Count
PC     1657313
HC     746562
FOODS  571148
REF    6179

Length

Max length: 5
Median length: 2
Mean length: 2.576822
Min length: 2

Characters and Unicode

Total characters: 7682027
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: HC
2nd row: HC
3rd row: HC
4th row: HC
5th row: HC

Common Values

Value      Count    Frequency (%)
PC         1657313  55.5%
HC         746562   25.0%
FOODS      571148   19.1%
REF        6179     0.2%
(Missing)  7448     0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value  Count    Frequency (%)
pc     1657313  55.6%
hc     746562   25.0%
foods  571148   19.2%
ref    6179     0.2%

Most occurring characters

Value  Count    Frequency (%)
C      2403875  31.3%
P      1657313  21.6%
O      1142296  14.9%
H      746562   9.7%
F      577327   7.5%
D      571148   7.4%
S      571148   7.4%
R      6179     0.1%
E      6179     0.1%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  7682027  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  7682027  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  7682027  100.0%

All characters fall into a single category, script and block, so the most-frequent-character tables per category, per script and per block are identical to "Most occurring characters".

cat2
Categorical

HIGH CORRELATION 

Distinct: 15
Distinct (%): < 0.1%
Missing: 7448
Missing (%): 0.2%
Memory size: 184.2 MiB

Value              Count
CABELLO            813398
DEOS               510270
SOPAS Y CALDOS     344693
ROPA LAVADO        266667
HOGAR              223478
Other values (10)  822696

Length

Max length: 19
Median length: 14
Mean length: 7.6951468
Min length: 2

Characters and Unicode

Total characters: 22940787
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: VAJILLA
2nd row: VAJILLA
3rd row: VAJILLA
4th row: VAJILLA
5th row: VAJILLA

Common Values

Value                Count   Frequency (%)
CABELLO              813398  27.2%
DEOS                 510270  17.1%
SOPAS Y CALDOS       344693  11.5%
ROPA LAVADO          266667  8.9%
HOGAR                223478  7.5%
PIEL2                209945  7.0%
ADEREZOS             204671  6.8%
VAJILLA              155239  5.2%
PIEL1                90819   3.0%
ROPA ACONDICIONADOR  82492   2.8%
Other values (5)     79530   2.7%

Length

Histogram of lengths of the category

Words

Value             Count   Frequency (%)
cabello           813398  20.2%
deos              510270  12.7%
ropa              359332  8.9%
sopas             344693  8.6%
y                 344693  8.6%
caldos            344693  8.6%
lavado            266667  6.6%
hogar             223478  5.5%
piel2             209945  5.2%
aderezos          204671  5.1%
Other values (8)  408080  10.1%

Most occurring characters

Value              Count    Frequency (%)
O                  3375272  14.7%
A                  3360801  14.6%
L                  2890792  12.6%
E                  2081347  9.1%
S                  1789490  7.8%
D                  1524166  6.6%
C                  1333248  5.8%
(space)            1048718  4.6%
P                  1013302  4.4%
R                  900270   3.9%
Other values (14)  3623381  15.8%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  22940787  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  22940787  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  22940787  100.0%

All characters fall into a single category, script and block, so the most-frequent-character tables per category, per script and per block are identical to "Most occurring characters".

cat3
Text

Distinct: 93
Distinct (%): < 0.1%
Missing: 7448
Missing (%): 0.2%
Memory size: 185.9 MiB

Length

Max length: 19
Median length: 16
Mean length: 7.8008947
Min length: 3

Characters and Unicode

Total characters: 23256043
Distinct characters: 51
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Cristalino
2nd row: Cristalino
3rd row: Cristalino
4th row: Cristalino
5th row: Cristalino

Words

Value              Count    Frequency (%)
shampoo            380777   10.8%
aero               337515   9.6%
acondicionador     308574   8.8%
polvo              153986   4.4%
liquido            126363   3.6%
sopas              121289   3.4%
jabon              110439   3.1%
mayonesa           108842   3.1%
gel                102749   2.9%
noaero             81470    2.3%
Other values (88)  1694129  48.0%

Most occurring characters

Value              Count     Frequency (%)
o                  2097836   9.0%
O                  1879074   8.1%
A                  1782653   7.7%
a                  1527350   6.6%
e                  1131100   4.9%
C                  992719    4.3%
r                  987568    4.2%
S                  848721    3.6%
N                  791113    3.4%
l                  789103    3.4%
Other values (41)  10428806  44.8%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  23256043  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  23256043  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  23256043  100.0%

All characters fall into a single category, script and block, so the most-frequent-character tables per category, per script and per block are identical to "Most occurring characters".

brand
Categorical

HIGH CORRELATION 

Distinct: 37
Distinct (%): < 0.1%
Missing: 7448
Missing (%): 0.2%
Memory size: 180.3 MiB

Value              Count
NIVEA              384335
SHAMPOO3           338209
MAGGI              322839
DEOS1              299785
MUSCULO            242680
Other values (32)  1393354

Length

Max length: 10
Median length: 9
Mean length: 6.3284487
Min length: 3

Characters and Unicode

Total characters: 18866384
Distinct characters: 35
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Importado
2nd row: Importado
3rd row: Importado
4th row: Importado
5th row: Importado

Common Values

Value              Count   Frequency (%)
NIVEA              384335  12.9%
SHAMPOO3           338209  11.3%
MAGGI              322839  10.8%
DEOS1              299785  10.0%
MUSCULO            242680  8.1%
LIMPIEX            217199  7.3%
SHAMPOO2           141777  4.7%
NATURA             120648  4.0%
SHAMPOO1           109610  3.7%
COLBERT            89406   3.0%
Other values (27)  714714  23.9%

Length

Histogram of lengths of the category

Words

Value              Count   Frequency (%)
nivea              384335  12.9%
shampoo3           338209  11.3%
maggi              322839  10.8%
deos1              299785  10.1%
musculo            242680  8.1%
limpiex            217199  7.3%
shampoo2           141777  4.8%
natura             120648  4.0%
shampoo1           109610  3.7%
colbert            89406   3.0%
Other values (27)  714714  24.0%

Most occurring characters

Value              Count    Frequency (%)
O                  2259859  12.0%
A                  2147786  11.4%
M                  1578291  8.4%
I                  1449134  7.7%
E                  1399987  7.4%
S                  1373798  7.3%
P                  948177   5.0%
L                  781952   4.1%
N                  770778   4.1%
G                  727286   3.9%
Other values (25)  5429336  28.8%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  18866384  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  18866384  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  18866384  100.0%

All characters fall into a single category, script and block, so the most-frequent-character tables per category, per script and per block are identical to "Most occurring characters".

sku_size
Real number (ℝ)

HIGH CORRELATION 

Distinct: 75
Distinct (%): < 0.1%
Missing: 7448
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 445.277
Minimum: 1
Maximum: 10000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.8 MiB

Quantile statistics

Minimum: 1
5-th percentile: 10
Q1: 90
Median: 240
Q3: 450
95-th percentile: 1000
Maximum: 10000
Range: 9999
Interquartile range (IQR): 360

Descriptive statistics

Standard deviation: 741.1227
Coefficient of variation (CV): 1.6644082
Kurtosis: 39.527448
Mean: 445.277
Median Absolute Deviation (MAD): 160
Skewness: 5.1284545
Sum: 1.3274607 × 10^9
Variance: 549262.86
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count    Frequency (%)
200                306300   10.2%
400                214118   7.2%
350                200788   6.7%
90                 173479   5.8%
50                 155471   5.2%
10                 125059   4.2%
750                122149   4.1%
100                120666   4.0%
300                103069   3.4%
800                90204    3.0%
Other values (65)  1369899  45.8%

Minimum 10 values

Value  Count   Frequency (%)
1      14556   0.5%
2      21958   0.7%
3      3010    0.1%
4      20122   0.7%
5      49197   1.6%
6      14872   0.5%
8      18309   0.6%
10     125059  4.2%
12     30376   1.0%
15     28855   1.0%

Maximum 10 values

Value  Count  Frequency (%)
10000  3052   0.1%
7500   32     < 0.1%
5000   14532  0.5%
4500   38     < 0.1%
4000   18498  0.6%
3000   80012  2.7%
2000   7768   0.3%
1800   508    < 0.1%
1500   12842  0.4%
1400   3906   0.1%

stock_final
Real number (ℝ)

MISSING  ZEROS 

Distinct: 12596
Distinct (%): 1.1%
Missing: 1839319
Missing (%): 61.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.139183
Minimum: -27.31136
Maximum: 1562.0245
Zeros: 34082
Zeros (%): 1.1%
Negative: 28122
Negative (%): 0.9%
Memory size: 22.8 MiB

Quantile statistics

Minimum: -27.31136
5-th percentile: 0
Q1: 1.76106
Median: 7.17641
Q3: 23.05673
95-th percentile: 111.2202
Maximum: 1562.0245
Range: 1589.3358
Interquartile range (IQR): 21.29567

Descriptive statistics

Standard deviation: 74.750983
Coefficient of variation (CV): 2.7543564
Kurtosis: 114.17328
Mean: 27.139183
Median Absolute Deviation (MAD): 6.70467
Skewness: 8.9532795
Sum: 31191904
Variance: 5587.7094
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count    Frequency (%)
0                     34082    1.1%
0.049                 727      < 0.1%
0.11394               521      < 0.1%
0.7204                470      < 0.1%
3.42342               468      < 0.1%
-1.57248              450      < 0.1%
0.04423               447      < 0.1%
17.26234              445      < 0.1%
-0.01747              440      < 0.1%
27.70186              432      < 0.1%
Other values (12586)  1110849  37.2%
(Missing)             1839319  61.5%

Minimum 10 values

Value      Count  Frequency (%)
-27.31136  206    < 0.1%
-13.66656  65     < 0.1%
-13.33127  196    < 0.1%
-8.19961   64     < 0.1%
-8.15986   86     < 0.1%
-7.7212    24     < 0.1%
-5.86579   65     < 0.1%
-5.28091   94     < 0.1%
-5.18307   242    < 0.1%
-5.0992    51     < 0.1%

Maximum 10 values

Value       Count  Frequency (%)
1562.02448  221    < 0.1%
1284.38214  158    < 0.1%
1212.36734  158    < 0.1%
1146.09799  213    < 0.1%
1097.55623  149    < 0.1%
1057.38804  189    < 0.1%
1037.85386  186    < 0.1%
1031.01561  176    < 0.1%
978.16446   46     < 0.1%
916.3419    215    < 0.1%

Interactions

(64 pairwise interaction plots, one for each ordered pair of the 8 numeric variables; figures omitted.)

Correlations

                        brand    cat1    cat2  cust_request_qty  cust_request_tn  customer_id  periodo  plan_precios_cuidados  product_id  sku_size  stock_final      tn
brand                   1.000   1.000   0.834             0.003            0.075       -0.002   -0.001                  0.228      -0.075     0.201        0.025   0.075
cat1                    1.000   1.000   1.000             0.023           -0.257        0.056    0.013                  0.041       0.307    -0.077       -0.314  -0.258
cat2                    0.834   1.000   1.000            -0.012            0.092       -0.035   -0.024                  0.120      -0.106    -0.046        0.111   0.092
cust_request_qty        0.003   0.023  -0.012             1.000            0.376       -0.452   -0.011                  0.003      -0.008     0.009       -0.010   0.376
cust_request_tn         0.075  -0.257   0.092             0.376            1.000       -0.512   -0.030                  0.000      -0.592     0.472        0.324   1.000
customer_id            -0.002   0.056  -0.035            -0.452           -0.512        1.000   -0.022                  0.006      -0.007    -0.031       -0.007  -0.512
periodo                -0.001   0.013  -0.024            -0.011           -0.030       -0.022    1.000                  0.027       0.008    -0.001       -0.025  -0.030
plan_precios_cuidados   0.228   0.041   0.120             0.003            0.000        0.006    0.027                  1.000       0.010    -0.048       -0.017  -0.020
product_id             -0.075   0.307  -0.106            -0.008           -0.592       -0.007    0.008                  0.010       1.000    -0.552       -0.443  -0.592
sku_size                0.201  -0.077  -0.046             0.009            0.472       -0.031   -0.001                 -0.048      -0.552     1.000        0.354   0.472
stock_final             0.025  -0.314   0.111            -0.010            0.324       -0.007   -0.025                 -0.017      -0.443     0.354        1.000   0.324
tn                      0.075  -0.258   0.092             0.376            1.000       -0.512   -0.030                 -0.020      -0.592     0.472        0.324   1.000
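
The "High correlation" alerts in the overview are consistent with flagging every off-diagonal coefficient whose absolute value is at least 0.5. That cut-off is an assumption: it reproduces all eight alerts from the matrix above, but it is not stated anywhere in this report. The sketch below applies it; the function and constant names are illustrative.

```python
NAMES = [
    "brand", "cat1", "cat2", "cust_request_qty", "cust_request_tn",
    "customer_id", "periodo", "plan_precios_cuidados", "product_id",
    "sku_size", "stock_final", "tn",
]

# Rows and columns follow NAMES; values copied from the matrix above.
MATRIX = [
    [1.000, 1.000, 0.834, 0.003, 0.075, -0.002, -0.001, 0.228, -0.075, 0.201, 0.025, 0.075],
    [1.000, 1.000, 1.000, 0.023, -0.257, 0.056, 0.013, 0.041, 0.307, -0.077, -0.314, -0.258],
    [0.834, 1.000, 1.000, -0.012, 0.092, -0.035, -0.024, 0.120, -0.106, -0.046, 0.111, 0.092],
    [0.003, 0.023, -0.012, 1.000, 0.376, -0.452, -0.011, 0.003, -0.008, 0.009, -0.010, 0.376],
    [0.075, -0.257, 0.092, 0.376, 1.000, -0.512, -0.030, 0.000, -0.592, 0.472, 0.324, 1.000],
    [-0.002, 0.056, -0.035, -0.452, -0.512, 1.000, -0.022, 0.006, -0.007, -0.031, -0.007, -0.512],
    [-0.001, 0.013, -0.024, -0.011, -0.030, -0.022, 1.000, 0.027, 0.008, -0.001, -0.025, -0.030],
    [0.228, 0.041, 0.120, 0.003, 0.000, 0.006, 0.027, 1.000, 0.010, -0.048, -0.017, -0.020],
    [-0.075, 0.307, -0.106, -0.008, -0.592, -0.007, 0.008, 0.010, 1.000, -0.552, -0.443, -0.592],
    [0.201, -0.077, -0.046, 0.009, 0.472, -0.031, -0.001, -0.048, -0.552, 1.000, 0.354, 0.472],
    [0.025, -0.314, 0.111, -0.010, 0.324, -0.007, -0.025, -0.017, -0.443, 0.354, 1.000, 0.324],
    [0.075, -0.258, 0.092, 0.376, 1.000, -0.512, -0.030, -0.020, -0.592, 0.472, 0.324, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, chosen because it matches the alerts


def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables with |coefficient| >= threshold."""
    flagged = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            flagged[name] = partners
    return flagged


flagged = high_correlations(NAMES, MATRIX, THRESHOLD)
for name, partners in flagged.items():
    print(f"{name}: {', '.join(partners)}")
```

With this cut-off, sku_size is flagged only against product_id, and stock_final, periodo, cust_request_qty and plan_precios_cuidados are not flagged at all, as in the alert list.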

Missing values

Count: a simple visualization of nullity by column.
Matrix: the nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
Heatmap: the correlation heatmap measures nullity correlation, that is, how strongly the presence or absence of one variable affects the presence of another.
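
Nullity by column is just a per-column count of missing entries. The sketch below counts them on a few toy rows, using None for a missing value. The first two rows borrow values from the sample rows of this report and the third is hypothetical. The final check uses the report's own figures: the five columns with 7448 missing values each, plus stock_final, account for every missing cell in the overview.

```python
def missing_counts(rows, columns):
    """Count, for each column, how many rows hold None in that column."""
    counts = {column: 0 for column in columns}
    for row in rows:
        for column in columns:
            if row.get(column) is None:
                counts[column] += 1
    return counts


columns = ["cat1", "sku_size", "stock_final"]
rows = [
    {"cat1": "HC", "sku_size": 500.0, "stock_final": None},
    {"cat1": "PC", "sku_size": 200.0, "stock_final": 1.82373},
    {"cat1": None, "sku_size": None, "stock_final": None},  # hypothetical row
]
counts = missing_counts(rows, columns)
print(counts)

# Figures from the report: cat1, cat2, cat3, brand and sku_size each have
# 7448 missing values, and stock_final has 1839319.
total_missing = 5 * 7448 + 1_839_319
print(total_missing)
```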

Sample

First rows

   periodo  customer_id  product_id  plan_precios_cuidados  cust_request_qty  cust_request_tn       tn  cat1  cat2     cat3        brand      sku_size  stock_final
0   201701        10234       20524                      0                 2          0.05300  0.05300  HC    VAJILLA  Cristalino  Importado     500.0          NaN
1   201701        10032       20524                      0                 1          0.13628  0.13628  HC    VAJILLA  Cristalino  Importado     500.0          NaN
2   201701        10217       20524                      0                 1          0.03028  0.03028  HC    VAJILLA  Cristalino  Importado     500.0          NaN
3   201701        10125       20524                      0                 1          0.02271  0.02271  HC    VAJILLA  Cristalino  Importado     500.0          NaN
4   201701        10012       20524                      0                11          1.54452  1.54452  HC    VAJILLA  Cristalino  Importado     500.0          NaN
5   201701        10080       20524                      0                 1          0.01514  0.01514  HC    VAJILLA  Cristalino  Importado     500.0          NaN
6   201701        10015       20524                      0                 4          0.10600  0.10600  HC    VAJILLA  Cristalino  Importado     500.0          NaN
7   201701        10062       20524                      0                 1          0.18928  0.18928  HC    VAJILLA  Cristalino  Importado     500.0          NaN
8   201701        10159       20524                      0                 3          0.02271  0.02271  HC    VAJILLA  Cristalino  Importado     500.0          NaN
9   201701        10183       20524                      0                 1          0.01514  0.01514  HC    VAJILLA  Cristalino  Importado     500.0          NaN

Last rows

         periodo  customer_id  product_id  plan_precios_cuidados  cust_request_qty  cust_request_tn       tn  cat1  cat2     cat3          brand  sku_size  stock_final
2988640   201912        10021       20853                      0                 8          0.15829  0.15829  PC    CABELLO  Shampoo Bebe  NIVEA     200.0      1.82373
2988641   201912        10093       20853                      0                 1          0.05574  0.05574  PC    CABELLO  Shampoo Bebe  NIVEA     200.0      1.82373
2988642   201912        10003       20853                      0                 9          0.62426  0.62426  PC    CABELLO  Shampoo Bebe  NIVEA     200.0      1.82373
2988643   201912        10367       20853                      0                 1          0.00446  0.00446  PC    CABELLO  Shampoo Bebe  NIVEA     200.0      1.82373
2988644   201912        10278       20853                      0                 5          0.06020  0.06020  PC    CABELLO  Shampoo Bebe  NIVEA     200.0      1.82373
2988645   201912        10105       20853                      0                 1          0.02230  0.02230  PC    CABELLO  Shampoo Bebe  NIVEA     200.0      1.82373
2988646   201912        10092       20853                      0                 1          0.00669  0.00669  PC    CABELLO  Shampoo Bebe  NIVEA     200.0      1.82373
2988647   201912        10006       20853                      0                 7          0.02898  0.02898  PC    CABELLO  Shampoo Bebe  NIVEA     200.0      1.82373
2988648   201912        10018       20853                      0                 4          0.01561  0.01561  PC    CABELLO  Shampoo Bebe  NIVEA     200.0      1.82373
2988649   201912        10020       20853                      0                 2          0.01561  0.01561  PC    CABELLO  Shampoo Bebe  NIVEA     200.0      1.82373

Duplicate rows

Most frequently occurring

   periodo  customer_id  product_id  plan_precios_cuidados  cust_request_qty  cust_request_tn        tn  cat1   cat2            cat3        brand    sku_size  stock_final  # duplicates
0   201701        10001       20010                      0                 3          1.31914   1.31914  HC     ROPA LAVADO     Polvo       LIMPIEX     400.0          NaN             2
1   201701        10001       20021                      0                 3          1.87824   1.87824  HC     ROPA LAVADO     Polvo       LIMPIEX     400.0          NaN             2
2   201701        10001       20022                      0                10         15.35789  15.35789  HC     ROPA LAVADO     Polvo       LIMPIEX     800.0          NaN             2
3   201701        10001       20037                      0                 6          5.40278   5.40278  FOODS  SOPAS Y CALDOS  Caldo Cubo  MAGGI        12.0          NaN             2
4   201701        10001       20105                      0                 8          6.95036   6.95036  FOODS  SOPAS Y CALDOS  Salsas Wet  MAGGI       350.0          NaN             2
5   201701        10002       20010                      0                16         57.77117  56.09386  HC     ROPA LAVADO     Polvo       LIMPIEX     400.0          NaN             2
6   201701        10002       20020                      0                12         29.24813  29.24813  HC     ROPA LAVADO     Polvo       LIMPIEX     800.0          NaN             2
7   201701        10002       20021                      0                22         37.49491  36.21072  HC     ROPA LAVADO     Polvo       LIMPIEX     400.0          NaN             2
8   201701        10002       20022                      0                11         42.00269  42.00269  HC     ROPA LAVADO     Polvo       LIMPIEX     800.0          NaN             2
9   201701        10002       20037                      0                26         15.80998  15.11284  FOODS  SOPAS Y CALDOS  Caldo Cubo  MAGGI        12.0          NaN             2
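
A duplicate row here means a row whose values match another row in every column. The sketch below counts such groups with a plain Counter over row tuples, using None for a missing stock_final so that missing values compare equal. It only illustrates the idea and is not the profiler's implementation; the row values are illustrative and loosely follow the tables in this report.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Column order: periodo, customer_id, product_id, plan_precios_cuidados,
# cust_request_qty, cust_request_tn, tn, cat1, cat2, cat3, brand,
# sku_size, stock_final.
repeated = (201701, 10001, 20010, 0, 3, 1.31914, 1.31914,
            "HC", "ROPA LAVADO", "Polvo", "LIMPIEX", 400.0, None)
unique = (201701, 10234, 20524, 0, 2, 0.05300, 0.05300,
          "HC", "VAJILLA", "Cristalino", "Importado", 500.0, None)

rows = [repeated, unique, repeated]
groups = duplicate_groups(rows)

# Rows beyond the first occurrence in each group.
extra_rows = sum(n - 1 for n in groups.values())
print(len(groups), extra_rows)
```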